Arpeggio: Metadata Indexing in a Structured Peer-to-Peer Network
نویسندگان
چکیده
Peer-to-peer networks require an efficient means for performing searches for files by metadata keywords. Unfortunately, current methods usually sacrifice either scalability or recall. Arpeggio is a peer-to-peer file-sharing network that uses the Chord lookup primitive as a basis for constructing a distributed keyword-set index, augmented with index-side filtering, to address this problem. We introduce index gateways, a technique for minimizing index maintenance overhead. Arpeggio also includes a content distribution system for finding source peers for a file; we present a novel system that uses Chord subrings to track live source peers without the cost of inserting the data itself into the network, and supports postfetching : using information in the index to improve the availability of rare files. The result is a system that provides efficient query operations with the scalability and reliability advantages of full decentralization. We use analysis and simulation results to show that our indexing system has reasonable storage and bandwidth costs, and improves load distribution. Thesis Supervisor: David R. Karger Title: Professor of Electrical Engineering and Computer Science
منابع مشابه
Arpeggio: Metadata Searching and Content Sharing with Chord
Arpeggio is a peer-to-peer file-sharing network based on the Chord lookup primitive. Queries for data whose metadata matches a certain criterion are performed efficiently by using a distributed keywordset index, augmented with index-side filtering. We introduce index gateways, a technique for minimizing index maintenance overhead. Because file data is large, Arpeggio employs subrings to track l...
متن کاملTAC: A Topology-Aware Chord-based Peer-to-Peer Network
Among structured Peer-to-Peer systems, Chord has a general popularity due to its salient features like simplicity, high scalability, small path length with respect to network size, and flexibility on node join and departure. However, Chord doesn’t take into account the topology of underlying physical network when a new node is being added to the system, thus resulting in high routing late...
متن کاملA Query-Adaptive Partial Distributed Hash Table for Peer-to-Peer Systems
The two main approaches to find data in peer-to-peer (P2P) systems are unstructured networks using flooding and structured networks using a distributed index. A distributed index is usually built over all keys that are stored in the network whether they are queried or not. Indexing all keys is no longer feasible when indexing metadata, as the key space becomes very large. Here we need a query-a...
متن کاملA Novel Caching Strategy in Video-on-Demand (VoD) Peer-to-Peer (P2P) Networks Based on Complex Network Theory
The popularity of video-on-demand (VoD) streaming has grown dramatically over the World Wide Web. Most users in VoD P2P networks have to wait a long time in order to access their requesting videos. Therefore, reducing waiting time to access videos is the main challenge for VoD P2P networks. In this paper, we propose a novel algorithm for caching video based on peers' priority and video's popula...
متن کاملIntegrating RDF Querying Capabilities into a Distributed Search Infrastructure
The Semantic Web is inherently distributed, and covers both metadata and full-text information. Semantic search therefore can profit a lot from peer-to-peer infrastructures as well as from powerful metadata search functionalities based on full-text search technologies. In this paper we focus on an approach extending an existing P2P search infrastructure with RDF querying capabilities, which bot...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2007